chore: Document approx_topk keyword. #15179

Open
wants to merge 1 commit into
base: main

jeschkies
Contributor

What this PR does / why we need it:
This is a follow-up to #14243 and documents the new `approx_topk` query function.

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • Title matches the required conventional commits format, see here
    • Note that Promtail is considered to be feature complete, and future development for logs collection will be in Grafana Alloy. As such, feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

@jeschkies jeschkies requested a review from a team as a code owner November 28, 2024 14:49
@github-actions github-actions bot added the type/docs label Nov 28, 2024
Contributor

@JStickler JStickler left a comment


[docs team]


## Probabilistic aggregation

LogQL also supports a probabilistic `topk` approximation that is a drop-in replacement when `topk` hits the maximum series limit.
Contributor


The current text assumes that users are already familiar with what topk is, but we don't really do a good job of defining it in the documentation.

Suggested change
LogQL also supports a probabilistic `topk` approximation that is a drop-in replacement when `topk` hits the maximum series limit.
The `topk` keyword lets you find the largest k elements in a data stream by sample value. LogQL also supports a probabilistic approximation, `approx_topk`, which is a drop-in replacement when `topk` hits the maximum series limit.
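
To make the `topk` definition concrete, here is a minimal Python sketch of the semantics as they are generally understood: from a set of series with one sample value each, keep the k series with the largest values. The function name, series labels, and values are made up for illustration; this is not Loki's implementation.

```python
import heapq


def topk(k, series):
    """Return the k (labels, value) pairs with the largest values, largest first."""
    return heapq.nlargest(k, series.items(), key=lambda item: item[1])


# Hypothetical instant vector: one sample value per series.
samples = {
    '{app="api"}': 120.0,
    '{app="web"}': 45.0,
    '{app="db"}': 300.0,
    '{app="cache"}': 7.0,
}

top2 = topk(2, samples)
for labels, value in top2:
    print(labels, value)
```

The exact version has to hold every series in memory before it can pick the winners, which is why a very large inner vector can run into a series limit.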


It is only supported for instant queries and does not support grouping. It is useful when the cardinality of the inner
vector is too high, e.g. when it uses an aggregation by a structured metadata label.
Contributor


Suggested change
vector is too high, e.g. when it uses an aggregation by a structured metadata label.
vector is too high, for example, when it uses an aggregation by a structured metadata label.
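
The text here does not say which algorithm backs `approx_topk`. As an illustration of how a probabilistic top-k can keep memory bounded when cardinality is high, the sketch below uses a count-min sketch, one common technique for this kind of problem. Treat it as an assumption for explanatory purposes: `CountMinSketch`, `approx_top_k`, the `width` and `depth` parameters, and the `user-N` keys are all hypothetical names, not Loki APIs.

```python
import hashlib


class CountMinSketch:
    """Fixed-size counter table; estimates never undercount but may overcount."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # Salt the hash with the row number so each row uses a different mapping.
        digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.rows[row][self._index(row, key)] += count

    def estimate(self, key):
        # The smallest counter across rows is the least inflated by collisions.
        return min(self.rows[row][self._index(row, key)] for row in range(self.depth))


def approx_top_k(k, events, width=256, depth=4):
    """Approximate the k most frequent keys with a sketch plus at most k candidates."""
    sketch = CountMinSketch(width, depth)
    candidates = {}  # key -> estimate recorded when the key was last seen
    for key in events:
        sketch.add(key)
        est = sketch.estimate(key)
        if key in candidates or len(candidates) < k:
            candidates[key] = est
        else:
            smallest = min(candidates, key=candidates.get)
            if est > candidates[smallest]:
                del candidates[smallest]
                candidates[key] = est
    return sorted(candidates.items(), key=lambda item: item[1], reverse=True)


# Hypothetical high-cardinality input: two heavy keys and 200 keys seen once.
events = ["user-1"] * 50 + ["user-2"] * 30 + [f"user-{i}" for i in range(3, 203)]
result = approx_top_k(2, events)
print(result)
```

Memory use depends on `width`, `depth`, and `k` rather than on the number of distinct keys, at the cost of counts that can be overestimated when keys collide.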

Labels
size/S type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories
2 participants